Abstract. Functional graph grammars are finite devices which generate 
the class of regular automata. We recall the notion of synchronization by 
grammars, and for any given grammar we consider the class of languages 
recognized by automata generated by all its synchronized grammars. 
The synchronization is an automaton-related notion: all grammars gen- 
erating the same automaton synchronize the same languages. When the 
synchronizing automaton is unambiguous, the class of its synchronized 
languages forms an effective boolean algebra lying between the classes 
of regular languages and unambiguous context-free languages. We addi- 
tionally provide sufficient conditions for such classes to be closed under 
concatenation and its iteration. 


1 Introduction 


An automaton over some alphabet can simply be seen as a finite or countable 
set of labelled arcs together with two sets of initial and final vertices. Such an 
automaton recognizes the language of all words labelling an accepting path, 
i.e. a path leading from an initial to a final vertex. It is well-known that fi- 
nite automata recognize the regular languages. By applying basic constructions 
to finite automata, we obtain the nice closure properties of regular languages, 
namely their closure under boolean operations, concatenation and its iteration. 
For instance the synchronization product and the determinization of finite au- 
tomata respectively yield the closure of regular languages under intersection and 
under complement. 

This idea can be extended to more general classes of automata. In this paper, 
we will be interested in the class of regular automata, which recognize context- 
free languages and are defined as the (generally infinite) automata generated by 
functional graph grammars [Ca 07]. Regular automata of finite degree are also 
precisely those automata which can be finitely decomposed by distance, as well 
as the regular restrictions of transition graphs of pushdown automata [MS 85], 
[Ca 07]. Even though the class of context-free languages does not enjoy the same 
closure properties as regular languages, one can define subclasses of context-free 
languages which do, using the notion of synchronization. 

The notion of synchronization was first defined between grammars [CH 08]. A 
grammar S is synchronized by a grammar R if for any accepting path μ of (the 
graph generated by) S, there exists an accepting path λ of R with the same 
label u such that λ and μ are synchronized: for every prefix v of u, the prefixes 
of λ and μ labelled by v lead to vertices of the same level (where the level of 
a vertex is the minimal number of rewriting steps necessary for the grammar 
to produce it). A language is synchronized by a grammar R if it is recognized 
by an automaton generated by a grammar synchronized by R. A fundamental 
result is that two grammars generating the same automaton yield the same class 
of synchronized languages [Ca 08]. This way, the notion of synchronization can 
be transferred to the level of automata: for a regular automaton G, the family 
Sync(G) is the set of languages synchronized by any grammar generating G. 
By extending the above-mentioned constructions from finite automata to gram- 
mars, one can establish several closure properties of these families of synchro- 
nized languages. The sum of two grammars and the synchronization product of 
a grammar with a finite automaton respectively entail the closure of Sync(G) 
under union and under intersection with a regular language for any regular au- 
tomaton G. The (level preserving) synchronization product of two grammars 
yields the closure under intersection of Sync(G) when G is unambiguous i.e. 
when any two accepting paths of G have distinct labels. Normalizing a grammar 
into a grammar only containing arcs and then determinizing it yields, for any 
unambiguous automaton G, the closure of Sync(G) under complement relative 
to L(G). This normalization also allows us to express Sync(G) in the case of an 
infinite degree graph G, by performing the ε-closure of Sync(H) for some finite 
degree automaton H using an extra label ε. A final useful normalization only 
allows the presence of initial and final vertices at level 0. It yields sufficient con- 
ditions for the closure of classes of synchronized languages under concatenation 
and its iteration. 

In Section 2, we recall the definition of regular automata. In the next section, we 
summarize known results on the synchronization of regular automata [Ca 06], 
[NS 07], [CH 08], [Ca 08]. In the last section, we present a simpler construction 
for the closure under complement of Sync(G) for unambiguous G [Ca 08] and 
present new results, especially sufficient conditions for the closure of Sync(G) 
under concatenation and its iteration. 


2 Regular automata 


An automaton is a labelled oriented simple graph with input and output vertices. 
It recognizes the set of words labelling the paths from an input to an output. 
Finite automata are automata having a finite number of vertices; they recognize 
the class of regular languages. Regular automata are the automata generated by 
functional graph grammars; they recognize the class of context-free languages. A 
key result, originally due to Muller and Schupp, identifies the regular automata 
of finite degree with the automata finitely generated by distance. 


An automaton over an alphabet (finite set of symbols) T of terminals is just a 
set of arcs labelled over T (a simple labelled oriented graph) with initial and final 
vertices. We use two symbols $\iota$ and $o$ to mark respectively the initial and final 
vertices. More precisely an automaton G is defined by 
$G \subseteq T \times V \times V \,\cup\, \{\iota, o\} \times V$ 
where V is an arbitrary set such that the set of vertices 

$$V_G = \{\, s \in V \mid \exists a \in T\ \exists t \in V,\ (a,s,t) \in G \,\lor\, (a,t,s) \in G \,\}$$

of G is finite or countable. Any triple $(a,s,t) \in G$ is an arc labelled by a from 
source s to goal t; it is identified with the labelled transition $s \xrightarrow[G]{a} t$ 
or directly $s \xrightarrow{a} t$ if G is understood. Any pair $(c,s) \in G$ is a vertex s 
coloured by $c \in \{\iota, o\}$, also written $c\,s$. A vertex is initial (resp. final) if 
it is coloured by $\iota$ (resp. $o$) i.e. $\iota\,s \in G$ (resp. $o\,s \in G$). An example 
of an automaton is given by 


G={n“Sntl|[n>0} U{n >r |n>0} U{n 2 y"|n>0} 
U {art} 8, or in >0}U{ yt 2 y" | n>0} 
U {10, oy} Ufozr™|n>0} Wy |n>0} 
and is represented (up to isomorphism) below. 


Figure 2.1 An automaton. 


An automaton G is thus a simple vertex- and arc-labelled graph. G has finite 
degree if for any vertex s, the set 
$\{\, t \mid \exists a\ (s \xrightarrow{a} t \,\lor\, t \xrightarrow{a} s) \,\}$ 
of its adjacent vertices is finite. Recall that $(s_0, a_1, s_1, \ldots, a_n, s_n)$ for 
$n \ge 0$ and 
$s_0 \xrightarrow[G]{a_1} s_1 \ldots s_{n-1} \xrightarrow[G]{a_n} s_n$ 
is a path from $s_0$ to $s_n$ labelled by $u = a_1 \ldots a_n$; we write 
$s_0 \overset{u}{\underset{G}{\Longrightarrow}} s_n$ or directly 
$s_0 \overset{u}{\Longrightarrow} s_n$ if G is understood. An accepting path 
is a path from an initial vertex to a final vertex. An automaton is unambiguous 
if any two distinct accepting paths have distinct labels. The automaton of 
Figure 2.1 is unambiguous. The language recognized by an automaton G is the set 
L(G) of all labels of its accepting paths: 
$L(G) = \{\, u \in T^* \mid \exists s,t\ (s \overset{u}{\Longrightarrow} t \,\land\, \iota\,s,\ o\,t \in G) \,\}$. 
Note that $\varepsilon \in L(G)$ if there exists a vertex s which is initial and final: 
$\iota\,s,\ o\,s \in G$. 
The automaton G of Figure 2.1 recognizes the language 

L(G) = {ab | 02 rem U {ab |n>0} UO ee Oo 
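The definitions of accepting path, recognized language and unambiguity can be sketched on finite automata as follows. This is only an illustrative encoding, not the paper's construction: the tuple representation, the colour names "i" and "o" (standing for the initial and final symbols) and the function names are assumptions made for this example.

```python
# Arcs are triples (a, s, t); coloured vertices are pairs (c, s).
# "i" stands for the initial colour (iota) and "o" for the final colour.

def count_accepting_paths(G, word):
    """Number of accepting paths of G labelled by `word`."""
    # counts[s] = number of paths from an initial vertex to s labelled
    # by the prefix of `word` read so far
    counts = {x[1]: 1 for x in G if len(x) == 2 and x[0] == "i"}
    for letter in word:
        step = {}
        for x in G:
            if len(x) == 3 and x[0] == letter and x[1] in counts:
                step[x[2]] = step.get(x[2], 0) + counts[x[1]]
        counts = step
    return sum(n for s, n in counts.items() if ("o", s) in G)

def accepts(G, word):
    """True if `word` belongs to L(G)."""
    return count_accepting_paths(G, word) > 0

# G recognizes a b*; every accepted word has exactly one accepting path.
G = {("a", 0, 1), ("b", 1, 1), ("i", 0), ("o", 1)}
# H is ambiguous: the word "a" labels two distinct accepting paths.
H = {("a", 0, 1), ("a", 0, 2), ("i", 0), ("o", 1), ("o", 2)}
```

An automaton is unambiguous exactly when `count_accepting_paths` never exceeds 1; on a finite automaton this can only be tested word by word with this sketch.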
The languages recognized by finite automata are the regular languages over T. 
We generalize finite automata to regular automata using functional graph grammars. 
To define a graph grammar, we need to extend an arc (resp. a graph) to a 
hyperarc (resp. a hypergraph). Although such an extension is natural, this may 
explain why functional graph grammars are not very widespread at the moment. 
But we will see in the last section that for our purpose, we can restrict to gram- 
mars using only arcs. 
Let F be a set of symbols ranked by a mapping $\varrho : F \to \mathbb{N}$ associating to each 
$f \in F$ its arity $\varrho(f) \ge 0$ such that $F_n = \{\, f \in F \mid \varrho(f) = n \,\}$ is countable for 
every $n \ge 0$ with $T \subseteq F_2$ and $\iota, o \in F_1$. 
A hypergraph G is a subset of $\bigcup_{n \ge 0} F_n \times V^n$ where V is an arbitrary set. Any 
tuple $(f, s_1, \ldots, s_{\varrho(f)}) \in G$, also written $f s_1 \ldots s_{\varrho(f)}$, is a hyperarc of label f and 
of successive vertices $s_1, \ldots, s_{\varrho(f)}$. We add the condition that the set of vertices 
$V_G$ is finite or countable, and the set of labels $F_G$ is finite. An arc is a hyperarc 
$f s t$ labelled by $f \in F_2$ and is also denoted by $s \xrightarrow{f} t$. For $n \ge 2$, a hyperarc 
$f s_1 \ldots s_n$ is depicted as an arrow labelled f and successively linking $s_1, \ldots, s_n$. 
For n = 1 and n = 0, it is respectively depicted as a label f (called a colour) on 
vertex $s_1$ and as an isolated label f called a constant. This is illustrated in the 
next figures. For instance the following hypergraph: 

G = {41,5 1,2—5,5 = 3,6 3,1,14,06, A456} 
with $a, b \in F_2$ and $A \in F_3$, is represented below. 

Figure 2.2 A finite hypergraph. 


A (coloured) graph G is a hypergraph whose labels are only of arity 1 or 2: 
$F_G \subseteq F_1 \cup F_2$. An automaton G over the alphabet T is a graph with a set of 
labels $F_G \subseteq T \cup \{\iota, o\}$. We can now introduce functional graph grammars to 
generate regular automata. 

A graph grammar R is a finite set of rules of the form $f x_1 \ldots x_{\varrho(f)} \to H$ where 
$f x_1 \ldots x_{\varrho(f)}$ is a hyperarc of label f called non-terminal joining pairwise distinct 
vertices $x_1 \ne \ldots \ne x_{\varrho(f)}$ and H is a finite hypergraph. 

We denote by $N_R$ the set of non-terminals of R i.e. the labels of the left hand 
sides, by $T_R = \{\, f \in F - N_R \mid \exists H \in \mathrm{Im}(R),\ f \in F_H \,\}$ the terminals of R i.e. 
the labels of R which are not non-terminals, and by $F_R = N_R \cup T_R$ the labels 
of R. 

We use grammars to generate automata hence in the following, we may assume 
that $T_R \subseteq T \cup \{\iota, o\}$. Similarly to context-free grammars (on words), a graph 
grammar has an axiom: an initial finite hypergraph. To indicate this axiom, we 
assume that any grammar R has a constant non-terminal $Z \in N_R \cap F_0$ which 
is not a label of any right hand side; the axiom of R is the right hand side H of 
the rule of Z: $Z \to H$ and $Z \notin F_K$ for any $K \in \mathrm{Im}(R)$. 

Starting from the axiom, we want R to generate a unique automaton up to 
isomorphism. So we finally assume that any grammar R is functional meaning 
that there is only one rule per non-terminal: if $(X, H), (Y, K) \in R$ with $X(1) = Y(1)$ 
then $(X, H) = (Y, K)$. 

For any rule $f x_1 \ldots x_{\varrho(f)} \to H$, we say that $x_1, \ldots, x_{\varrho(f)}$ are the inputs of f, 
and $V_{H - [H]}$ is the set of outputs of f. 

To work with these grammars, it is simpler to assume that any grammar R is 
terminal-outside [Ca 07]: any terminal arc or colour in a right hand side links 
to at least one non input vertex: $H \cap (T_R \times V_X \times V_X \,\cup\, T_R \times V_X) = \emptyset$ for any rule 
$(X, H) \in R$. 

We will use upper-case letters A, B, C,... for non-terminals and lower-case letters 
a,b,c... for terminals. Here is an example of a (functional graph) grammar R: 


Figure 2.3 A (functional graph) grammar. 


For the previous grammar R, we have $N_R = \{Z, A, B\}$ with Z the axiom and 
$\varrho(A) = \varrho(B) = 3$, $T_R = \{a, b, \iota, o\}$ and 1, 2, 3 are the inputs of A and B. 

Given a grammar R, the rewriting relation $\xrightarrow[R]{}$ is the binary relation between 
hypergraphs defined as follows: M rewrites into N, written $M \xrightarrow[R]{} N$, if we can 
choose a non-terminal hyperarc $X = A s_1 \ldots s_p$ in M and a rule $A x_1 \ldots x_p \to H$ in 
R such that N can be obtained by replacing X by H in M: $N = (M - X) \cup h(H)$ 
for some function h mapping each $x_i$ to $s_i$, and the other vertices of H injectively 
to vertices outside of M; this rewriting is denoted by $M \xrightarrow[R,X]{} N$. The rewriting 
$\xrightarrow[R,X]{}$ of a hyperarc X is extended in an obvious way to the rewriting 
$\xrightarrow[R,E]{}$ of any set E of non-terminal hyperarcs. The complete parallel rewriting 
$\Longrightarrow_R$ is a simultaneous rewriting according to the set of all non-terminal 
hyperarcs: $M \Longrightarrow_R N$ if $M \xrightarrow[R,E]{} N$ where E is the set of all 
non-terminal hyperarcs of M. We depict below the first three steps of the parallel 
derivation of the previous grammar from its constant non-terminal Z: 

Figure 2.4 Parallel derivation for the grammar of Figure 2.3. 


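Given a grammar R, we restrict any hypergraph H to the automaton [H] of its 
terminal arcs and coloured vertices: $[H] = H \cap (T \times V_H \times V_H \,\cup\, \{\iota, o\} \times V_H)$. 
An automaton G is generated by R (from its axiom) if G belongs to the following 
set $R^\omega$ of isomorphic automata: 

$$R^\omega = \{\, \textstyle\bigcup_{n \ge 0} [H_n] \mid Z \xrightarrow[R]{} H_0 \Longrightarrow_R \ldots\, H_n \Longrightarrow_R H_{n+1} \ldots \,\}.$$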


Note that in all generality, we need to consider hypergraphs with multiplicities. 
However using an appropriate normal form, this technicality can be safely omit- 
ted [Ca 07]. 
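The complete parallel rewriting and the generated automaton can be sketched as follows. This is a minimal illustration under assumptions that are not in the paper: hyperarcs are Python tuples `(label, v1, ..., vn)`, a grammar is a dictionary mapping each non-terminal to its inputs and right hand side, fresh vertices are consecutive integers, and hypergraphs are plain sets (so multiplicities are ignored). The small grammar at the end is a hypothetical example, not one of the grammars of the figures.

```python
from itertools import count

def parallel_step(grammar, M, fresh):
    """One complete parallel rewriting step: every non-terminal
    hyperarc of M is replaced by the right hand side of its rule."""
    N = set()
    for arc in M:
        label, args = arc[0], arc[1:]
        if label not in grammar:
            N.add(arc)                      # terminal arcs and colours are kept
            continue
        inputs, rhs = grammar[label]
        h = dict(zip(inputs, args))         # h maps each input x_i to s_i
        for hyper in sorted(rhs, key=repr): # fixed order, reproducible names
            for v in hyper[1:]:
                if v not in h:
                    h[v] = next(fresh)      # other vertices become fresh ones
            N.add((hyper[0],) + tuple(h[v] for v in hyper[1:]))
    return N

def generate(grammar, steps):
    """Union of the terminal parts [H_0], ..., [H_steps]."""
    fresh = count()
    M = parallel_step(grammar, {("Z",)}, fresh)        # Z -> H_0
    G = {arc for arc in M if arc[0] not in grammar}
    for _ in range(steps):
        M = parallel_step(grammar, M, fresh)           # H_n => H_{n+1}
        G |= {arc for arc in M if arc[0] not in grammar}
    return G

# Hypothetical grammar: an infinite a-labelled chain whose vertices are
# all final ("o") and whose first vertex is initial ("i").
grammar = {
    "Z": ((), {("i", "x"), ("o", "x"), ("A", "x")}),
    "A": ((1,), {("a", 1, "y"), ("o", "y"), ("A", "y")}),
}
G3 = generate(grammar, 3)
```

Only a finite prefix of the generated automaton is computed; the automaton itself is the union over all steps.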

For instance the automaton of Figure 2.1 is generated by the grammar of Fig- 
ure 2.3. A regular automaton is an automaton generated by a (functional graph) 
grammar. Note that a regular automaton has a finite number of non-isomorphic 
connected components, and has a finite number of distinct vertex degrees. 
Another example is given by the following grammar: 




which generates the following automaton: 


recognizing the language $\{\, u\,c\,\tilde{u} \mid u \in \{a,b\}^* \,\}$ where $\tilde{u}$ is the mirror of u. 
The language recognized by a grammar R is the language L(R) recognized by 
its generated automaton: $L(R) = L(G)$ for (any) $G \in R^\omega$. This language is 
well-defined since all automata generated by a given grammar are isomorphic. 
A grammar R is an unambiguous grammar if the automaton it generates is 
unambiguous. 
There is a canonical way to generate the regular automata of finite degree which 
allows us to characterize these automata without the explicit use of grammars. This 
is the finite decomposition by distance. 
The inverse $G^{-1}$ of an automaton G is the automaton obtained from G by 
reversing its arcs and by exchanging initial and final vertices: 

$$G^{-1} = \{\, t \xrightarrow{a} s \mid s \xrightarrow[G]{a} t \,\} \,\cup\, \{\, \iota\,s \mid o\,s \in G \,\} \,\cup\, \{\, o\,s \mid \iota\,s \in G \,\}.$$

So $G^{-1}$ recognizes the mirror of the words recognized by G. The restriction $G_{|I}$ 
of G to a subset I of vertices is the subgraph of G induced by I: 

$$G_{|I} = G \cap (T \times I \times I \,\cup\, \{\iota, o\} \times I).$$

The distance $d_I(s)$ of a vertex s to I is the minimal length of the undirected paths 
between s and I: 
$d_I(s) = \min\{\, |u| \mid \exists r \in I,\ r \overset{u}{\underset{G \cup G^{-1}}{\Longrightarrow}} s \,\}$ with $\min(\emptyset) = +\infty$. 

We take a new colour $\# \in F_1 - \{\iota, o\}$ and define for any integer $n \ge 0$, 

$$\mathrm{Dec}^n_\#(G, I) = G_{|\{\, s \,\mid\, d_I(s) \ge n \,\}} \,\cup\, \{\, \#\,s \mid d_I(s) = n \,\}.$$

In particular $\mathrm{Dec}^0_\#(G, I) = G \cup \{\, \#\,s \mid s \in I \,\}$. We say that an automaton G is 
finitely decomposable by distance if for each connected component C of G there 
exists a finite non empty set I of vertices such that $\bigcup_{n \ge 0} \mathrm{Dec}^n_\#(C, I)$ has a finite 
number of non-isomorphic connected components. Such a definition allows the 
characterization of the class of all automata of finite degree which are regular. 
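The inverse, the restriction, the distance to a vertex set and the decomposition by distance can be sketched on a finite automaton as follows. The encoding is an assumption made for illustration (arcs as triples, colours as pairs, "i" and "o" for the initial and final colours), and the function names are hypothetical; vertices missing from the returned distance map are those at distance +∞.

```python
from collections import deque

def inverse(G):
    """Reverse every arc and exchange the colours "i" and "o"."""
    swap = {"i": "o", "o": "i"}
    return {(x[0], x[2], x[1]) if len(x) == 3 else (swap.get(x[0], x[0]), x[1])
            for x in G}

def restriction(G, I):
    """Arcs and colours of G whose vertices all belong to I."""
    return {x for x in G if all(v in I for v in x[1:])}

def distances(G, I):
    """d_I(s) by breadth-first search along arcs taken in both directions."""
    adj = {}
    for x in G:
        if len(x) == 3:
            adj.setdefault(x[1], set()).add(x[2])
            adj.setdefault(x[2], set()).add(x[1])
    dist = {s: 0 for s in I}
    queue = deque(I)
    while queue:
        s = queue.popleft()
        for t in adj.get(s, ()):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist

def dec(G, I, n):
    """Restriction of G to distance >= n, with the frontier coloured by "#"."""
    dist = distances(G, I)
    keep = {v for x in G for v in x[1:] if dist.get(v, float("inf")) >= n}
    return restriction(G, keep) | {("#", s) for s in keep if dist.get(s) == n}

G = {("a", 0, 1), ("a", 1, 2), ("b", 2, 3), ("i", 0), ("o", 3)}
```

For this finite chain, `dec(G, {0}, 2)` keeps only the part of the automaton at distance at least 2 from vertex 0 and marks vertex 2 with `#`.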


Theorem 2.5 An automaton of finite degree is regular if and only if it is 
finitely decomposable by distance and it has only a finite number of non iso- 
morphic connected components. 


The proof is given in [Ca 07] and is a slight extension of [MS 85] (but without 
using pushdown automata). Regular automata of finite degree are also the tran- 
sition graphs of pushdown automata restricted to regular sets of configurations 
and with regular sets of initial and final configurations. In particular, regular 
automata of finite degree recognize the same languages as pushdown automata. 


Proposition 2.6 The (resp. unambiguous) regular automata recognize ex- 
actly the (resp. unambiguous) context-free languages. 


This proposition remains true if we restrict to automata of finite degree. We now 
use grammars to extend the family of regular languages to boolean algebras of 
unambiguous context-free languages. 


3 Synchronization of regular automata 


We introduce the idea of synchronization between grammars. The languages 
synchronized by a grammar R are the languages recognized by grammars 
synchronized by R. We show that these families of languages are closed under 
union by applying the sum of grammars, are closed under intersection with a 
regular language by defining the synchronization product of a grammar with a 
finite automaton, and are closed under intersection (in the case of grammars 
generating unambiguous automata) by performing the synchronization product 
of grammars. Finally we show that all grammars generating the same automaton 
synchronize the same languages. 


To each vertex s of an automaton $G \in R^\omega$ generated by a grammar R, we 
associate a non negative integer $\ell(s)$ which is the minimal number of rewritings 
applied from the axiom necessary to reach s. More precisely for $G = \bigcup_{n \ge 0} [H_n]$ 
with $Z \xrightarrow[R]{} H_0 \Longrightarrow_R \ldots\, H_n \Longrightarrow_R H_{n+1} \ldots$, the level $\ell(s)$ of $s \in V_G$, also written $\ell^G_R(s)$ 
to specify G and R, is $\ell(s) = \min\{\, n \mid s \in V_{H_n} \,\}$. 

We depict below the levels of some vertices of the regular automaton of Fig- 
ure 2.1 generated by the grammar of Figure 2.3. This automaton is represented 
by vertices of increasing level: vertices at a same level are aligned vertically. 




Figure 3.1 Vertex levels with the grammar of Figure 2.3. 


We say that a grammar S is synchronized by a grammar R written $S \lhd R$, or 
equivalently that R synchronizes S written $R \rhd S$, if for any accepting path μ 
labelled by u of the automaton generated by S, there is an accepting path λ labelled 
by u of the automaton generated by R such that for every prefix v of u, the 
prefixes of λ and μ labelled by v lead to vertices of the same level: for (any) 
$G \in R^\omega$ and (any) $H \in S^\omega$ and for any 
$t_0 \xrightarrow[H]{a_1} t_1 \ldots \xrightarrow[H]{a_n} t_n$ with $\iota\,t_0,\ o\,t_n \in H$, 
there exists 
$s_0 \xrightarrow[G]{a_1} s_1 \ldots \xrightarrow[G]{a_n} s_n$ with $\iota\,s_0,\ o\,s_n \in G$ and $\ell^G_R(s_i) = \ell^H_S(t_i)\ \ \forall i \in [0,n]$. 
For instance the grammar of Figure 2.3 synchronizes the following grammar: 


Figure 3.2 A grammar synchronized by the grammar of Figure 2.3. 
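The synchronization condition can be checked on bounded paths with the following sketch. It is not the paper's procedure: it assumes finite automata encoded as sets of tuples, with the level of each vertex supplied as a dictionary (rather than computed from a grammar), and it only inspects accepting paths up to a chosen length. All names are hypothetical.

```python
def traces(G, level, max_len):
    """Pairs (word, level sequence) of the accepting paths of G
    of length at most max_len."""
    arcs = [x for x in G if len(x) == 3]
    final = {x[1] for x in G if len(x) == 2 and x[0] == "o"}
    frontier = [((), (level[x[1]],), x[1])
                for x in G if len(x) == 2 and x[0] == "i"]
    found = set()
    for _ in range(max_len + 1):
        found |= {(w, ls) for w, ls, s in frontier if s in final}
        frontier = [(w + (a,), ls + (level[t],), t)
                    for w, ls, s in frontier
                    for a, s2, t in arcs if s2 == s]
    return found

def synchronized_up_to(H, level_H, G, level_G, max_len):
    """True if every accepting path of H of length <= max_len is matched
    by an accepting path of G with the same label and the same levels."""
    return traces(H, level_H, max_len) <= traces(G, level_G, max_len)

# G recognizes (ab)* and alternates between levels 0 and 1.
G = {("a", 0, 1), ("b", 1, 0), ("i", 0), ("o", 0)}
level_G = {0: 0, 1: 1}
# H recognizes only ab.
H = {("a", "p", "q"), ("b", "q", "r"), ("i", "p"), ("o", "r")}
good_levels = {"p": 0, "q": 1, "r": 0}
bad_levels = {"p": 0, "q": 0, "r": 0}
```

With `good_levels` the path of H follows the levels 0, 1, 0 of the corresponding path of G; with `bad_levels` the languages are still included but the levels disagree, so the check fails.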


In particular for $S \lhd R$, we have $L(S) \subseteq L(R)$. Note that the empty grammar 
$\{(Z, \emptyset)\}$ is synchronized by any grammar. The synchronization relation $\rhd$ is a 
reflexive and transitive relation. We denote $\bowtie$ the bi-synchronization relation: 
$R \bowtie S$ if $R \rhd S$ and $S \rhd R$. Note that bi-synchronized grammars $R \bowtie S$ 
may generate distinct automata: $R^\omega \ne S^\omega$. For any grammar R, the image of R 
by $\rhd$ is the family $\rhd(R) = \{\, S \mid S \lhd R \,\}$ of grammars synchronized by R and 
$\mathrm{Sync}(R) = \{\, L(S) \mid S \lhd R \,\}$ is the family of languages synchronized by R. 
Note that Sync(R) is a family of languages included in L(R) and containing the 
empty language and L(R). Note also that $\mathrm{Sync}(R) = \mathrm{Sync}(S)$ for $R \bowtie S$. 
Standard operations on finite automata are extended to grammars in order to 
obtain closure properties of Sync(R). For instance the synchronization product 
of finite automata is extended to arbitrary automata G and H by 

$$G \times H = \{\, (s,p) \xrightarrow{a} (t,q) \mid s \xrightarrow[G]{a} t \,\land\, p \xrightarrow[H]{a} q \,\}$$
$$\cup\ \{\, \iota\,(s,p) \mid \iota\,s \in G \,\land\, \iota\,p \in H \,\} \ \cup\ \{\, o\,(s,p) \mid o\,s \in G \,\land\, o\,p \in H \,\}$$

which recognizes $L(G \times H) = L(G) \cap L(H)$. 
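The synchronization product can be sketched directly from its definition. The encoding below (arcs as triples, colours as pairs, "i" and "o" for the initial and final colours) and the two example automata are assumptions made for illustration.

```python
from itertools import product

def sync_product(G, H):
    """Synchronization product: arcs and colours are paired label by label."""
    arcs = {(x[0], (x[1], y[1]), (x[2], y[2]))
            for x in G if len(x) == 3
            for y in H if len(y) == 3 and y[0] == x[0]}
    colours = {(x[0], (x[1], y[1]))
               for x in G if len(x) == 2
               for y in H if len(y) == 2 and y[0] == x[0]}
    return arcs | colours

def accepts(G, word):
    """True if `word` labels a path from an initial to a final vertex."""
    current = {x[1] for x in G if len(x) == 2 and x[0] == "i"}
    for letter in word:
        current = {x[2] for x in G
                   if len(x) == 3 and x[0] == letter and x[1] in current}
    return any(("o", s) in G for s in current)

# G recognizes a*b*; K recognizes the words with an even number of a.
G = {("a", 0, 0), ("b", 0, 1), ("b", 1, 1), ("i", 0), ("o", 0), ("o", 1)}
K = {("a", "p", "q"), ("a", "q", "p"), ("b", "p", "p"), ("b", "q", "q"),
     ("i", "p"), ("o", "p")}
P = sync_product(G, K)

# Compare L(G x K) with the intersection of L(G) and L(K) on short words.
words = [w for n in range(5) for w in product("ab", repeat=n)]
agree = all(accepts(P, w) == (accepts(G, w) and accepts(K, w)) for w in words)
```

The comparison on words of length at most 4 is only a sanity check of the equality $L(G \times H) = L(G) \cap L(H)$, not a proof.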
This allows us to define the synchronization product $R \times K$ of a grammar R 
with a finite automaton K [CH 08]. Let $\{q_1, \ldots, q_n\}$ be the vertex set of K. To 
each $A \in N_R$, we associate a new symbol $(A,n)$ of arity $\varrho(A) \times n$ except that 
$(Z,0) = Z$, and to each hyperarc $A r_1 \ldots r_m$ with $m = \varrho(A)$, we associate the 
hyperarc $(A r_1 \ldots r_m)_K = (A,n)(r_1,q_1) \ldots (r_1,q_n) \ldots (r_m,q_1) \ldots (r_m,q_n)$. 
The grammar $R \times K$ associates to each rule $(X, H) \in R$ the following rule: 

$$X_K \to [H] \times K \ \cup\ \{\, (BY)_K \mid BY \in H \,\land\, B \in N_R \,\}.$$


Example 3.3 Let us consider the following grammar R: 




generating the following (regular) automaton G: 




and recognizing the restricted Dyck language $D_1'^{*}$ over the pair (a,b) [Be 79]: 
$L(R) = L(G) = D_1'^{*}$. We consider the following finite automaton K : 


recognizing the set of words over {a,b} having an even number of a. 
So RxK is the following grammar: 




generating the automaton Gx K : 


which recognizes $D_1'^{*}$ restricted to the words with an even number of a. 


The synchronization product of a grammar R with a finite automaton K is 
synchronized by R i.e. $R \times K \lhd R$ and recognizes $L(R \times K) = L(R) \cap L(K)$. 


Proposition 3.4 For any grammar R, the family Sync(R) is closed under 
intersection with a regular language. 


Propositions 2.6 and 3.4 imply the well-known closure property of the family 
of context-free languages under intersection with a regular language. As $R \times K$ 
is unambiguous for R unambiguous and K deterministic, we also obtain 
Theorem 6.4.1 of [Ha 78]: the family of unambiguous context-free languages is closed 
under intersection with a regular language. 
Another basic operation on finite automata is the disjoint union. This operation 
is extended to any grammars $R_1$ and $R_2$. For any $i \in \{1,2\}$, we denote 
$\overline{R}_i = R_i \times (\{\, i \xrightarrow{a} i \mid a \in T \,\} \cup \{\iota\,i,\ o\,i\})$ in order to distinguish the vertices of 
$R_1$ and $R_2$. For $(Z, H_1) \in \overline{R}_1$ and $(Z, H_2) \in \overline{R}_2$, the sum of $R_1$ and $R_2$ is the 
grammar 

$$R_1 + R_2 = \{(Z, H_1 \cup H_2)\} \ \cup\ (\overline{R}_1 - \{(Z, H_1)\}) \ \cup\ (\overline{R}_2 - \{(Z, H_2)\}).$$

So $(R_1 + R_2)^\omega = \{\, G_1 \cup G_2 \mid G_1 \in R_1^\omega \,\land\, G_2 \in R_2^\omega \,\land\, V_{G_1} \cap V_{G_2} = \emptyset \,\}$ hence 
$L(R_1 + R_2) = L(R_1) \cup L(R_2)$. In particular if $S_1 \lhd R_1$ and $S_2 \lhd R_2$ then 
$S_1 + S_2 \lhd R_1 + R_2$. 


Proposition 3.5 For any grammar R, Sync(R) is closed under union. 


The synchronization product of regular automata can be non regular. Further- 
more for the regular automaton G: 


the languages $\{\, a^m b^m a^n \mid m,n \ge 0 \,\}$ and $\{\, a^m b^n a^n \mid m,n \ge 0 \,\}$ are in Sync(G) 
but their intersection $\{\, a^n b^n a^n \mid n \ge 0 \,\}$ is not a context-free language. 
The synchronization product of a grammar with a finite automaton is extended 
to two grammars R and S for generating the level synchronization product 
$G \times_\ell H$ of their generated automata $G \in R^\omega$ and $H \in S^\omega$ which is the 
restriction of $G \times H$ to pairs of vertices with same level: $G \times_\ell H = (G \times H)_{|P}$ for 
$P = \{\, (s,p) \in V_G \times V_H \mid \ell^G_R(s) = \ell^H_S(p) \,\}$. This product can be generated by 
a grammar $R \times_\ell S$ that we define. Let $(A,B) \in N_R \times N_S$ be any pair of 
non-terminals and $E \subseteq [1, \varrho(A)] \times [1, \varrho(B)]$ be a binary relation over inputs such 
that for all $i,j \in [1, \varrho(A)]$, if $E(i) \cap E(j) \ne \emptyset$ then $E(i) = E(j)$, where 
$E(i) = \{\, j \mid (i,j) \in E \,\}$ denotes the image of $i \in [1, \varrho(A)]$ by E. Intuitively 
for a pair $(A,B) \in N_R \times N_S$ of non-terminals, a relation $E \subseteq [1, \varrho(A)] \times [1, \varrho(B)]$ 
is used to memorize which entries of A and B are being synchronized. 
To any such A, B and E, we associate a new symbol $[A, B, E]$ of arity $|E|$ 
(where $[Z, Z, \emptyset]$ is assimilated to Z). To each non-terminal hyperarc $A r_1 \ldots r_m$ 
of R ($A \in N_R$ and $m = \varrho(A)$) and each non-terminal hyperarc $B s_1 \ldots s_n$ of S 
($B \in N_S$ and $n = \varrho(B)$), we associate the hyperarc 
$[A r_1 \ldots r_m, B s_1 \ldots s_n, E] = [A, B, E](r_1,s_1)_E \ldots (r_1,s_n)_E \ldots (r_m,s_1)_E \ldots (r_m,s_n)_E$ 
with $(r_i,s_j)_E = (r_i,s_j)$ if $(i,j) \in E$, and $\varepsilon$ otherwise. The grammar $R \times_\ell S$ is 
then defined by associating to each $(AX, P) \in R$, each $(BY, Q) \in S$, and each 
$E \subseteq [\varrho(A)] \times [\varrho(B)]$, the rule of left hand side $[AX, BY, E]$ and of right hand side 
$([P] \times [Q])_{|\overline{E}} \ \cup\ \{\, [CU, DV, E'] \mid CU \in P \,\land\, C \in N_R \,\land\, DV \in Q \,\land\, D \in N_S \,\}$ 
with $\overline{E} = \{\, (X(i), Y(j)) \mid (i,j) \in E \,\} \cup (V_P - V_X) \times (V_Q - V_Y)$ and 
$E' = \{\, (i,j) \in [\varrho(C)] \times [\varrho(D)] \mid (U(i), V(j)) \in \overline{E} \,\}$. 
Note that $R \times_\ell S$ is synchronized by R and S, and is bi-synchronized with S for 
$S \lhd R$. Furthermore $R \times_\ell S$ generates $G \times_\ell H$ for $G \in R^\omega$ and $H \in S^\omega$ hence 
recognizes a subset of $L(R) \cap L(S)$. However for grammars S and S' synchronized 
by an unambiguous grammar R, we have $L(S \times_\ell S') = L(S) \cap L(S')$. 
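At the level of automata, the level synchronization product is simply the synchronization product restricted to pairs of vertices of equal level. The sketch below illustrates this on finite automata, assuming (this is not the grammar construction described above) that the level of each vertex is supplied as a dictionary; the function name and the examples are hypothetical.

```python
def level_sync_product(G, level_G, H, level_H):
    """Restriction of the synchronization product of G and H to the
    pairs (s, p) of vertices having the same level."""
    out = set()
    for x in G:
        for y in H:
            # pair arcs with arcs and colours with colours, label by label
            if x[0] != y[0] or len(x) != len(y):
                continue
            pairs = tuple(zip(x[1:], y[1:]))
            if all(level_G[s] == level_H[p] for s, p in pairs):
                out.add((x[0],) + pairs)
    return out

# Both automata recognize the single word a.
G = {("a", 0, 1), ("i", 0), ("o", 1)}
H = {("a", "p", "q"), ("i", "p"), ("o", "q")}
level_G = {0: 0, 1: 1}

same = level_sync_product(G, level_G, H, {"p": 0, "q": 1})
diff = level_sync_product(G, level_G, H, {"p": 0, "q": 0})
```

When the levels disagree along the path (`diff`), the arc is lost and the product no longer recognizes `a`, which illustrates why the level product only recognizes a subset of the intersection in general.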


Proposition 3.6 For any unambiguous grammar R, the family Sync(R) is 
closed under intersection. 


By extending basic operations on finite automata to grammars, it appears that 
graph grammars are to context-free languages what finite automata are to reg- 
ular languages. We will continue these extensions in the next section. Let us 
present a fundamental result concerning grammar synchronization, which states 
that Sync(R) is independent of the way the automaton $R^\omega$ is generated. 


Theorem 3.7 For any grammars R and S such that $R^\omega = S^\omega$, we have 
Sync(R) = Sync(S). 


Proof sketch. 
By symmetry of R and S, it is sufficient to show that $\mathrm{Sync}(R) \subseteq \mathrm{Sync}(S)$. 
Let $R' \lhd R$. We want to show that $L(R') \in \mathrm{Sync}(S)$. 
We have to show the existence of $S' \lhd S$ such that $L(S') = L(R')$. 
Note that it is possible that there is no grammar S' synchronized by S and 
generating the same automaton as R' (i.e. $S' \lhd S$ and $S'^\omega = R'^\omega$). 
Let $G \in R^\omega = S^\omega$. Any vertex s of G has a level $\ell^G_R(s)$ according to R and a 
level $\ell^G_S(s)$ according to S. 
Let $H \in R'^\omega$ and let $K = (G \times_\ell H)_{|P}$ be the automaton obtained by level 
synchronization product of G with H and restricted to the set P of vertices accessible 
from $\iota$ and co-accessible from $o$. 
The restriction by accessibility from $\iota$ and co-accessibility from $o$ can be done 
by a bi-synchronized grammar [Ca 08]. By definition of $R \times_\ell R'$, the automaton 
K can be generated by a grammar R'' bi-synchronized to R' with 
$\ell^K_{R''}(s,p) = \ell^G_R(s) = \ell^H_{R'}(p)$ for every $(s,p) \in V_K$. 
In particular $L(K) = L(R')$. 
Let us show that K is generated by a grammar synchronized by S. 
We give the proof for $R^\omega$ of finite degree. In that case and for $\|\varrho\| = \sum_{A \in N_R} \varrho(A)$, 
$|\ell^G_R(s) - \ell^G_R(t)| \le \|\varrho\| \cdot d_G(s,t)$ for every $s,t \in V_G$. 
Furthermore K is also of finite degree. 
We show that K is finitely decomposable not by distance but according to $\ell^G_S(s)$ 
for the vertices (s, p) of K. 
Let $n \ge 0$ and C be a connected component of $K_{|\{\, (s,p) \in V_K \,\mid\, \ell^G_S(s) \ge n \,\}}$. 
So C is fully determined by 
its frontier : $\mathrm{Fr}_K(C) = V_C \cap V_{K-C}$ 
its interface : $\mathrm{Int}_K(C) = \{\, s \xrightarrow[C]{a} t \mid \{s,t\} \cap \mathrm{Fr}_K(C) \ne \emptyset \,\}$. 
Let $(s_0, p_0) \in \mathrm{Fr}_K(C)$ and D be the connected component of $G_{|\{\, s \,\mid\, \ell^G_S(s) \ge n \,\}}$ 
containing $s_0$. It remains to find a bound b independent of n such that 
$|\ell^K_{R''}(s,p) - \ell^K_{R''}(t,q)| \le b$ for every $(s,p), (t,q) \in \mathrm{Fr}_K(C)$. 
For any $(s,p), (t,q) \in \mathrm{Fr}_K(C)$, we have $s,t \in \mathrm{Fr}_G(D)$ hence $d_D(s,t)$ is bounded 
by the integer 
$c = \max\{\, d_{S^\omega(A)}(i,j) < +\infty \mid A \in N_S \,\land\, i,j \in [1, \varrho(A)] \,\}$ 
where $S^\omega(A) = \{\, \bigcup_{n \ge 0} [H_n] \mid A\,1 \ldots \varrho(A) = H_0 \Longrightarrow_S \ldots\, H_n \Longrightarrow_S H_{n+1} \ldots \,\}$ 
thus it follows that 
$|\ell^K_{R''}(s,p) - \ell^K_{R''}(t,q)| = |\ell^G_R(s) - \ell^G_R(t)| \le \|\varrho\| \cdot d_G(s,t) \le \|\varrho\| \cdot d_D(s,t) \le \|\varrho\| \cdot c$. 
For G of infinite degree and by Proposition 4.9, we can express Sync(G) as an 
ε-closure of Sync(H) for some regular automaton H of finite degree using 
ε-transitions. 


Theorem 3.7 allows us to transfer the concept of grammar synchronization to the 
level of regular automata: for any regular automaton G, we can define 


$\mathrm{Sync}(G) = \mathrm{Sync}(R)$ for (any) R such that $G \in R^\omega$. 
Let us illustrate these ideas by presenting some examples of well-known sub- 
families of context-free languages obtained by synchronization. 


Example 3.8 For any finite automaton G, Sync(G) is the family of regular 
languages included in L(G). 


Example 3.9 For the following regular automaton G: 




Sync(G) is the family of input-driven languages [Me 80] with a pushing, b pop- 
ping and c internal. As the initial vertex is not source of an arc labelled by b, 
Sync(G) does not contain all the regular languages. 


Example 3.10 We complete the previous automaton by adding a b-loop on 
the initial vertex to obtain the following automaton G: 




The set Sync(G) is the family of visibly pushdown languages [AM 04] with a 
pushing, b popping and c internal. 


Example 3.11 For the following regular automaton G: 


the set Sync(G) is the family of balanced languages [BB 02] with a, b pushing, 
with their corresponding popping letters $\bar{a}$, $\bar{b}$, and c internal. 


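Example 3.12 For the grammar R of Figure 3.2, Sync(R) is the family of 
languages generated by the following linear context-free grammars: 
$I = P + a^m A\,(b + \ldots + b^m)$ with $m > 0$ and $P \subseteq \{\, a^i b^j \mid 1 \le j \le i \le m \,\}$ 
$A = Q + a^n A\,(b + \ldots + b^n)$ with $n > 0$ and $Q \subseteq \{\, a^i b^j \mid 1 \le j \le i \le n \,\}$. 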

For each regular automaton G among the previous examples, Sync(G) is a 
boolean algebra according to L(G) and, except for the last two examples, is 
also closed under concatenation and its iteration. We now consider new closure 
properties of synchronized languages for regular automata. 


4 Closure properties 


We have seen that the family Sync(G) of languages synchronized by a regu- 
lar automaton G is closed under union and under intersection with a regular 
language, and under intersection when G is unambiguous. In this section, we 
consider the closure of Sync(G) under complement relative to L(G) and un- 
der concatenation and its transitive closure. To obtain these closure properties, 
we first apply grammar normalizations preserving the synchronized languages. 
These normalizations also allow us to add ε-arcs to any regular automaton to 
get a regular automaton of finite degree with the same synchronized languages. 


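First we put any grammar in an equivalent normal form with the same set of 
synchronized languages. As in the case of finite automata, we transform any 
automaton G into the pointed automaton $G^\top_\bot$ which is language equivalent 
$L(G^\top_\bot) = L(G)$, with a unique initial vertex $\top \notin V_G$ which is goal of no arc 
and can be final, and with a unique non initial and final vertex $\bot \notin V_G$ which is 
source of no arc: 

$$G^\top_\bot = (G - \{\iota, o\} \times V_G) \ \cup\ \{\iota\,\top,\ o\,\bot\} \ \cup\ \{\, o\,\top \mid \exists s\ (\iota\,s,\ o\,s \in G) \,\}$$
$$\cup\ \{\, \top \xrightarrow{a} t \mid \exists s\ (s \xrightarrow[G]{a} t \,\land\, \iota\,s \in G) \,\}$$
$$\cup\ \{\, s \xrightarrow{a} \bot \mid \exists t\ (s \xrightarrow[G]{a} t \,\land\, o\,t \in G) \,\}$$
$$\cup\ \{\, \top \xrightarrow{a} \bot \mid \exists s,t\ (s \xrightarrow[G]{a} t \,\land\, \iota\,s,\ o\,t \in G) \,\}.$$

For instance, the finite degree regular automaton G of Figure 2.1 is transformed 
into the following infinite degree regular automaton $G^\top_\bot$: 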


Figure 4.1 A pointed regular automaton. 
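The pointed transformation can be sketched on a finite automaton as follows. The encoding (arcs as triples, colours as pairs, "i" and "o" for the initial and final colours) and the names `TOP` and `BOT` for the two new vertices are assumptions made for this illustration.

```python
from itertools import product

TOP, BOT = "top", "bot"   # the two new vertices, assumed not to occur in G

def pointed(G):
    """Pointed automaton: a unique initial vertex TOP which is goal of
    no arc, and a unique non initial final vertex BOT, source of no arc."""
    arcs = {x for x in G if len(x) == 3}
    initial = {x[1] for x in G if len(x) == 2 and x[0] == "i"}
    final = {x[1] for x in G if len(x) == 2 and x[0] == "o"}
    P = set(arcs) | {("i", TOP), ("o", BOT)}
    if initial & final:                   # the empty word is recognized
        P.add(("o", TOP))
    for a, s, t in arcs:
        if s in initial:
            P.add((a, TOP, t))
        if t in final:
            P.add((a, s, BOT))
        if s in initial and t in final:
            P.add((a, TOP, BOT))
    return P

def accepts(G, word):
    """True if `word` labels a path from an initial to a final vertex."""
    current = {x[1] for x in G if len(x) == 2 and x[0] == "i"}
    for letter in word:
        current = {x[2] for x in G
                   if len(x) == 3 and x[0] == letter and x[1] in current}
    return any(("o", s) in G for s in current)

# G recognizes a*b*.
G = {("a", 0, 0), ("b", 0, 1), ("b", 1, 1), ("i", 0), ("o", 0), ("o", 1)}
P = pointed(G)
words = [w for n in range(5) for w in product("ab", repeat=n)]
same_language = all(accepts(P, w) == accepts(G, w) for w in words)
```

The comparison on short words only checks the language equivalence on this example; the old vertices lose their colours, so acceptance in the pointed automaton goes through `TOP` and `BOT` only.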


Note that if G is unambiguous, $G^\top_\bot$ remains unambiguous. The pointed 
transformation of a regular automaton remains a regular automaton which can be 
generated by a 0-grammar: only the axiom has initial and final vertices. Let 
R be any grammar and $\top$, $\bot$ be two symbols which are not vertices of R. Let 
$G \in R^\omega$ with $\top, \bot \notin V_G$. We define a 0-grammar $R^\top_\bot$ generating $G^\top_\bot$ and 
preserving the synchronized languages: $\mathrm{Sync}(R^\top_\bot) = \mathrm{Sync}(R)$. 


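First we transform R into a grammar $\overline{R}$ in which we memorize in the 
non-terminals the input vertices which are linked to initial or final vertices of the 
generated automaton. More precisely to any $A \in N_R$ and $I, J \subseteq [1, \varrho(A)]$, we 
associate a new symbol $A_{I,J}$ of arity $\varrho(A)$ with $Z = Z_{\emptyset,\emptyset}$. We define the grammar 
$\overline{R}$ associating to each $(AX, H) \in R$ and $I, J \subseteq [1, \varrho(A)]$ the following rule: 

$$A_{I,J}\,X \to [H] \ \cup\ \{\, B_{I',J'}\,Y \mid BY \in H \,\land\, B \in N_R \,\}$$

with $I' = \{\, i \mid Y(i) \in X(I) \,\lor\, \iota\,Y(i) \in H \,\}$ and $J' = \{\, j \mid Y(j) \in X(J) \,\lor\, o\,Y(j) \in H \,\}$ 
and we restrict the rules of $\overline{R}$ to the non-terminals accessible from Z. 
Note that the set $L(R) \cap T$ of letters recognized by R can be determined as 

$$\{\, a \mid \exists (A_{I,J}X, H) \in \overline{R}\ \ (\exists i \in I\ \exists t,\ X(i) \xrightarrow[{[H]}]{a} t \,\land\, o\,t \in H)$$
$$\lor\ (\exists j \in J\ \exists s,\ s \xrightarrow[{[H]}]{a} X(j) \,\land\, \iota\,s \in H) \ \lor\ (\exists s,t,\ s \xrightarrow[H]{a} t \,\land\, \iota\,s,\ o\,t \in H) \,\}$$

and $\varepsilon \in L(R) \iff \exists H \in \mathrm{Im}(R)\ \exists s\ (\iota\,s,\ o\,s \in H)$. 
To any $A \in N_R - \{Z\}$ and any $I, J \subseteq [1, \varrho(A)]$, we associate a new symbol $A'_{I,J}$ 
of arity $\varrho(A) + 2$, and we define the grammar $R^\top_\bot$ containing the axiom rule 

$$Z \to H_{\emptyset,\emptyset} \ \cup\ \{\iota\,\top,\ o\,\bot\} \ \cup\ \{\, o\,\top \mid \varepsilon \in L(R) \,\} \ \cup\ \{\, \top \xrightarrow{a} \bot \mid a \in L(R) \cap T \,\}$$

for $(Z, H) \in \overline{R}$, and for any $(A_{I,J}X, H) \in \overline{R}$ with $A \ne Z$, we take in $R^\top_\bot$ the rule 
$A'_{I,J}\,\top X \bot \to H_{I,J}$ such that $H_{I,J}$ is the following hypergraph: 

$$H_{I,J} = ([H] - \{\iota, o\} \times V_H) \ \cup\ \{\, B'_{P,Q}\,\top Y \bot \mid B_{P,Q}\,Y \in H \,\land\, B_{P,Q} \in N_{\overline{R}} \,\}$$
$$\cup\ \{\, \top \xrightarrow{a} t \mid \exists i \in I\ (X(i) \xrightarrow[H]{a} t) \ \lor\ \exists s\ (\iota\,s \in H \,\land\, s \xrightarrow[H]{a} t) \,\}$$
$$\cup\ \{\, s \xrightarrow{a} \bot \mid \exists j \in J\ (s \xrightarrow[H]{a} X(j)) \ \lor\ \exists t\ (o\,t \in H \,\land\, s \xrightarrow[H]{a} t) \,\}$$

and we put $R^\top_\bot$ into a terminal-outside form [Ca 07]. 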


Example 4.2 Let us consider the following grammar R: 



In particular $\varepsilon, a, b \in L(R)$. Then R is transformed into the grammar $R^\top_\bot$: 




that we put in a terminal-outside form: 



So $R^\top_\bot$ generates $G^\top_\bot$: 




The grammars R and $R^\top_\bot$ synchronize the same languages. 


Proposition 4.3 For any regular automaton G with $\top, \bot \notin V_G$, the pointed 
automaton $G^\top_\bot$ remains regular and $\mathrm{Sync}(G^\top_\bot) = \mathrm{Sync}(G)$. 


It follows that, in order to define families of languages by synchronization by 
a regular automaton G, we can restrict to pointed automata G. A stronger 
normalization is to transform any grammar R into a grammar S such that 
Sync(S) = Sync(R) and S is an arc-grammar in the following sense: S is an
0-grammar in which every non-terminal $A \in N_S - \{Z\}$ is of arity 2, and for any
non-axiom rule $Ast \longrightarrow H$, there is no arc in $H$ of goal $s$ or of source $t$: for any
$p \xrightarrow[H]{a} q$, we have $p \neq t$ and $q \neq s$.


We can transform any 0-grammar R into a bi-synchronized arc-grammar <R>.
We assume that each rule of R is of the form $A\,1 \ldots \varrho(A) \longrightarrow H_A$ for any $A \in N_R$.


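We take a new symbol 0 (not a vertex of R) and a new label $A_{i,j}$ of arity 2 for
each $A \in N_R$ and each $i, j \in [1, \varrho(A)]$ in order to generate paths from $i$ to $j$ in
$R^{\omega}(A\,1 \ldots \varrho(A))$. We define the splitting <G> of any $F_R$-hypergraph $G$ without
vertex 0 as being the graph:

$$<G> \;=\; [G] \,\cup\, \{\, X(i) \xrightarrow{A_{i,j}} X(j) \mid AX \in G \wedge A \in N_R \wedge i, j \in [\varrho(A)] \,\}$$

and for $p, q \in V_G$ and $P \subseteq V_G$ with $0 \notin V_G$, we define

$$\begin{aligned}
G_{p,P,q} \;&=\; \big(\{\, s \xrightarrow[<G>]{a} t \mid t \neq p \wedge s \neq q \wedge s, t \notin P \,\}\big)_{|I} && \text{for } p \neq q \\
G_{p,P,p} \;&=\; \big(\{\, s \xrightarrow[<G>]{a} t \mid t \neq p \wedge s, t \notin P \,\} \,\cup\, \{\, s \xrightarrow{a} 0 \mid s \xrightarrow[<G>]{a} p \,\}\big)_{|J}
\end{aligned}$$

with $I = \{\, s \mid p \Longrightarrow s \Longrightarrow q \,\}$ and $J = \{\, s \mid p \Longrightarrow s \Longrightarrow 0 \,\}$.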
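This allows us to define the splitting <R> of R as being the following arc-grammar:

$$\begin{aligned}
Z \;&\longrightarrow\; <H_Z> \\
A_{i,j}\,1\,2 \;&\longrightarrow\; h_{i,j}\big((H_A)_{i,\,[\varrho(A)] - \{i,j\},\,j}\big) && \text{for each } A \in N_R \text{ and } i, j \in [1, \varrho(A)]
\end{aligned}$$

where $h_{i,j}$ is the vertex renaming defined by

$$\begin{aligned}
& h_{i,j}(i) = 1,\quad h_{i,j}(j) = 2,\quad h_{i,j}(x) = x \text{ otherwise}, && \text{for } i \neq j \\
& h_{i,i}(i) = 1,\quad h_{i,i}(0) = 2,\quad h_{i,i}(x) = x \text{ otherwise}.
\end{aligned}$$

Thus R and <R> are bi-synchronized, and <R> is unambiguous when R is
unambiguous. Note that we can put <R> into a reduced form by removing any
non-terminal $A_{i,j}$ such that $<R>^{\omega}(A_{i,j}\,1\,2)$ is without path from 1 to 2.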
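
As a rough illustration of the splitting <G> of a finite hypergraph, the following Python sketch adds one arc from X(i) to X(j), labelled by the pair (i, j), for every non-terminal hyperarc AX. The encoding is an assumption made here and not the paper's: a hyperarc is a tuple (label, v1, ..., vn), a terminal arc from s to t labelled a is ("a", s, t), and the function name `splitting` is hypothetical.

```python
from itertools import product

def splitting(hypergraph, nonterminals):
    """Return the graph <G>: the terminal part of G plus one arc
    X(i) --(A, i, j)--> X(j) for each non-terminal hyperarc A X.

    Assumed encoding (not from the paper): a hyperarc is a tuple
    (label, v1, ..., vn); `nonterminals` is the set of non-terminal labels.
    """
    result = set()
    for label, *vertices in hypergraph:
        if label not in nonterminals:
            # terminal hyperarcs (arcs and colours) are kept unchanged
            result.add((label, *vertices))
            continue
        n = len(vertices)
        # indices are 1-based, as in the text: i and j range over [1, arity]
        for i, j in product(range(1, n + 1), repeat=2):
            result.add(((label, i, j), vertices[i - 1], vertices[j - 1]))
    return result

# One terminal arc, one colour, and one non-terminal hyperarc A of arity 2.
G = {("a", "s", "t"), ("i", "s"), ("A", "t", "u")}
split = splitting(G, {"A"})
```

Here `split` keeps the two terminal hyperarcs and contains four new arcs for A, one per pair (i, j), including the loops obtained for i = j.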


Example 4.4 The following 0-grammar R:

[Figure: rules of the 0-grammar R; not recoverable from the source.]

The splitting <R> of R is the following grammar:

[Figure: rules of the arc-grammar <R>.]

generating the following automaton:

[Figure: the automaton generated by <R>.]

As R and <R> are bi-synchronized, we have Sync(R) = Sync(<R>).


To study closure properties of Sync(R) for any grammar R, we can work with its 
normal form $<R_{\bot}^{\top}>$ which is an arc-grammar generating a pointed automaton.
This normalization is really useful to study the closure property of Sync(R) 
under complement relative to L(R), under concatenation and its iteration. 

We have seen that Sync(R) is not closed in general under intersection, hence it
is not closed under complement relative to L(R), since for any $L, M \subseteq L(R)$,
$L \cap M = L(R) - [(L(R) - L) \cup (L(R) - M)]$. For R unambiguous, Sync(R) is
closed under intersection, and this remains true for complement relative to
L(R) [Ca 08]. We give here a simpler construction.

As $<R_{\bot}^{\top}>$ remains unambiguous, we can assume that R is an arc-grammar. Let
S be a grammar synchronized by R. We want to show that $L(R) - L(S) \in$ Sync(R).
So S is an 0-grammar and S is level-unambiguous as defined in [Ca 08]. Thus <S>
is a level-unambiguous arc-grammar. We take a new colour $c \in F_1 - \{\iota, o\}$ and
for any grammar $S'$, we denote by $S'_c$ (resp. $S'_o$) the grammar obtained from $S'$
by replacing the final colour $o$ by $c$ (resp. $c$ by $o$). So $R + <S>_c$ is an
arc-grammar and $(R + <S>_c)_o$ is level-unambiguous. It remains to apply the
grammar determinization in [Ca 08] to get the grammar $R/S = \mathrm{Det}(R + <S>_c)$
such that $(R/S)_o$ is unambiguous and bi-synchronized to $(R + <S>_c)_o$. Finally
we keep in $R/S$ the final vertices which are not coloured by $c$ to obtain a
grammar synchronized by R and recognizing $L(R) - L(S)$.


Theorem 4.5 For any unambiguous regular automaton G, the set Sync(G)
is an effective boolean algebra relative to L(G), containing all the regular
languages included in L(G).


So we can decide the inclusion $L(S) \subseteq L(S')$ for two grammars $S$ and $S'$
synchronized by a common unambiguous grammar. Furthermore for grammars $R_1$ and
$R_2$ such that $R_1 + R_2$ is level-unambiguous, Sync($R_1 + R_2$) $= \{\, L_1 \cup L_2 \mid L_1 \in$
Sync($R_1$) $\wedge\ L_2 \in$ Sync($R_2$) $\}$ is a boolean algebra included in $L(R_1) \cup L(R_2)$,
containing Sync($R_1$) and Sync($R_2$).
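
In an effective boolean algebra, inclusion reduces to an emptiness test: L is included in L' exactly when the intersection of L with the complement of L' is empty. The following Python sketch shows this reduction in the much simpler setting of complete deterministic finite automata. It is only an analogy under assumed encodings (a DFA is a triple of initial state, final states and transition dictionary, and `dfa_included` is a hypothetical name), not the grammar construction of [Ca 08].

```python
from collections import deque

def dfa_included(dfa1, dfa2, alphabet):
    """Decide L(dfa1) ⊆ L(dfa2) for complete DFAs.

    Each DFA is a triple (initial_state, final_states, delta) where delta
    maps (state, letter) to a state. Inclusion holds iff no reachable pair
    of the product is final in dfa1 and non-final in dfa2, that is, iff
    L(dfa1) ∩ complement(L(dfa2)) is empty.
    """
    (q1, f1, d1), (q2, f2, d2) = dfa1, dfa2
    seen = {(q1, q2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if p in f1 and q not in f2:
            return False  # some word is accepted by dfa1 but not by dfa2
        for a in alphabet:
            nxt = (d1[p, a], d2[q, a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# even_a accepts the words with an even number of a's; all_words accepts everything.
even_a = (0, {0}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1})
all_words = (0, {0}, {(0, "a"): 0, (0, "b"): 0})
```

With these two automata, `dfa_included(even_a, all_words, "ab")` holds while the converse inclusion fails on the word a.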

The automata of Examples 3.8 to 3.12 are unambiguous, hence their families of
synchronized languages are boolean algebras. This regular automaton G:

[Figure: a 2-ambiguous regular automaton G; not recoverable from the source.]

is 2-ambiguous: there are two accepting paths for the words $a^n b^n a^n$ with $n > 0$
and a unique accepting path for the other accepted words. But Sync(G) is not
closed under intersection since $\{\, a^n b^n a^m \mid m, n > 0 \,\}$ and $\{\, a^m b^n a^n \mid m, n > 0 \,\}$
are languages synchronized by G, whereas their intersection $\{\, a^n b^n a^n \mid n > 0 \,\}$
is not context-free.
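
The intersection of the two languages above can be checked on bounded exponents. The Python sketch below enumerates both languages for exponents up to a bound and compares their intersection with the words of the form a^n b^n a^n. The function names are hypothetical, and the check is a finite sanity test rather than a proof.

```python
def words_first(bound):
    """Words a^n b^n a^m with 0 < n, m <= bound."""
    return {"a" * n + "b" * n + "a" * m
            for n in range(1, bound + 1) for m in range(1, bound + 1)}

def words_second(bound):
    """Words a^m b^n a^n with 0 < n, m <= bound."""
    return {"a" * m + "b" * n + "a" * n
            for n in range(1, bound + 1) for m in range(1, bound + 1)}

def words_diagonal(bound):
    """Words a^n b^n a^n with 0 < n <= bound."""
    return {"a" * n + "b" * n + "a" * n for n in range(1, bound + 1)}

# Words belonging to both languages, for exponents up to 4.
common = words_first(4) & words_second(4)
```

For this bound, `common` coincides with `words_diagonal(4)`: a word lies in both languages only when its three blocks have the same length.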


For any regular automaton G, the closure of Sync(G) under concatenation $\cdot$
(resp. under its transitive closure $+$) does not require the unambiguity of G.
As L(G) $\in$ Sync(G), a necessary condition is to have $L(G).L(G) \in$ Sync(G)
(resp. $L(G)^{+} \in$ Sync(G)). Note that this necessary condition implies that L(G)
is closed under $\cdot$ (resp. $+$). In particular Sync(G) is not closed under $\cdot$ and $+$
for the automata of Examples 3.11 and 3.12. But this necessary condition is not
sufficient since the following regular automaton G:

[Figure: the regular automaton G; missing from the source.]

recognizes $L(G) = \varepsilon + M(a+b)^{*}$ for $M = \{\, a^n b^n \mid n > 0 \,\}$, hence $L(G).L(G) =
L(G) = L(G)^{+}$ but $M \in$ Sync(G) and $M.M,\ M^{+} \notin$ Sync(G).

Let us give a simple and general condition on a grammar R such that Sync(R)
is closed under $\cdot$ and $+$. We say that a grammar R is iterative if any initial vertex
is in the axiom and for (any) $G \in R^{\omega}$ and any accepting path $s_0 \xrightarrow{a_1} s_1 \ldots \xrightarrow{a_n} s_n$
with $\iota\,s_0,\ o\,s_n \in G$ and for any final vertex $t$, i.e. $o\,t \in G$, there exists a path
$t \xrightarrow{a_1} t_1 \ldots \xrightarrow{a_n} t_n$ with $o\,t_n \in G$ such that $\ell(t_i) = \ell(t) + \ell(s_i)$ for all $i \in [1, n]$.


For instance: the automaton of Example 3.9 can be generated by an iterative 
grammar. And any 0-grammar generating a regular automaton having a unique 
initial vertex which is the unique final vertex, is iterative. Standard constructions 
on finite automata for the concatenation and its iteration can be extended to 
iterative grammars. 
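
For reference, here is a minimal Python sketch of one standard finite-automaton construction mentioned above, the concatenation of two automata without ε-transitions. It only covers the finite case and does not attempt the extension to iterative grammars. The encoding (a triple of initial states, final states and arcs) and the names `accepts`, `tag` and `concatenation` are assumptions made for this illustration.

```python
def accepts(nfa, word):
    """Check whether `word` labels a path from an initial to a final state."""
    initials, finals, arcs = nfa
    current = set(initials)
    for letter in word:
        current = {t for (s, a, t) in arcs if s in current and a == letter}
    return bool(current & finals)

def tag(nfa, k):
    """Rename every state s to (k, s), to make two automata disjoint."""
    initials, finals, arcs = nfa
    return ({(k, s) for s in initials}, {(k, s) for s in finals},
            {((k, s), a, (k, t)) for (s, a, t) in arcs})

def concatenation(nfa1, nfa2):
    """Automaton for L(nfa1).L(nfa2), built without epsilon-transitions."""
    (i1, f1, d1), (i2, f2, d2) = tag(nfa1, 1), tag(nfa2, 2)
    # each arc entering a final state of nfa1 is copied towards the initial states of nfa2
    bridge = {(s, a, r) for (s, a, t) in d1 if t in f1 for r in i2}
    initials = i1 | (i2 if i1 & f1 else set())  # case where nfa1 accepts the empty word
    finals = f2 | (f1 if i2 & f2 else set())    # case where nfa2 accepts the empty word
    return (initials, finals, d1 | d2 | bridge)

# ab recognizes the word "ab", c recognizes the word "c".
ab = ({0}, {2}, {(0, "a", 1), (1, "b", 2)})
c = ({0}, {1}, {(0, "c", 1)})
both = concatenation(ab, c)
```

The automaton `both` accepts "abc" and rejects "ab" and "c". Copying the arcs that enter a final state, instead of adding ε-transitions, keeps the result in the same class of automata as its arguments.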


Proposition 4.6 For any iterative grammar R, the family Sync(R) is closed 
under concatenation and its transitive closure. 


However the automaton G of Example 3.10 cannot be generated by an iterative
grammar but Sync(G) is closed under $\cdot$ and $+$. We can also obtain families of
synchronized languages which are closed under $\cdot$ and $+$ by saturating grammars.
The saturation $G^{+}$ of an automaton G is the automaton

$$G^{+} \;=\; G \,\cup\, \{\, s \xrightarrow{a} r \mid \iota\,r \in G \,\wedge\, \exists\, t\ (s \xrightarrow[G]{a} t \,\wedge\, o\,t \in G) \,\}$$

recognizing $L(G^{+}) = (L(G))^{+}$.
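
On a finite automaton, the saturation can be computed directly from its definition. The Python sketch below assumes an automaton given as a triple (initial vertices, final vertices, arcs). This encoding and the names `saturation` and `accepts` are illustrative choices, not notation from the paper.

```python
def saturation(automaton):
    """Return G+ : G plus an arc s -a-> r for each initial vertex r
    and each arc s -a-> t of G whose goal t is final."""
    initials, finals, arcs = automaton
    added = {(s, a, r) for (s, a, t) in arcs if t in finals for r in initials}
    return (initials, finals, arcs | added)

def accepts(automaton, word):
    """Check whether `word` labels a path from an initial to a final vertex."""
    initials, finals, arcs = automaton
    current = set(initials)
    for letter in word:
        current = {t for (s, a, t) in arcs if s in current and a == letter}
    return bool(current & finals)

# G recognizes the single word "ab".
G = ({0}, {2}, {(0, "a", 1), (1, "b", 2)})
G_plus = saturation(G)
```

Here `G_plus` accepts "ab", "abab", "ababab" and so on, but neither the empty word nor "aba". Saturating a second time adds no arc, so `saturation(G_plus) == G_plus`.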


Note that if G is regular with infinite sets of initial and final vertices, $G^{+}$ can
be non regular (but is always prefix-recognizable). If G is generated by an
0-grammar R, its saturation $G^{+}$ can be generated by a grammar $R^{+}$ that we
define.
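Let $(Z, H)$ be the axiom rule of R and $r_1, \ldots, r_p$ be the initial vertices of $H$; we
can assume that $r_1, \ldots, r_p$ are not vertices of $R - \{(Z, H)\}$. To each $A \in N_R - \{Z\}$
and $I \subseteq [1, \varrho(A)]$, we associate a new symbol $A_I$ of arity $\varrho(A) + p$ and we define
$R^{+}$ with the following rules:

$$\begin{aligned}
Z \;&\longrightarrow\; [H]^{+} \,\cup\, \{\, A_{\{ i \,\mid\, o\,X(i) \in H \}}\,X\,r_1 \ldots r_p \mid AX \in H \wedge A \in N_R \,\} \\
A_I\,X\,r_1 \ldots r_p \;&\longrightarrow\; K_I \qquad \text{for each } (AX, K) \in R \text{ with } A \neq Z \text{ and each } I \subseteq [1, \varrho(A)]
\end{aligned}$$

where $K_I$ is the automaton obtained from $K$ as follows:

$$\begin{aligned}
K_I \;=\;& [K] \,\cup\, \{\, s \xrightarrow{a} r_j \mid j \in [p] \,\wedge\, \exists\, i \in I\ (s \xrightarrow[K]{a} X(i)) \,\} \\
&\cup\, \{\, B_{\{ j \,\mid\, \exists\, i \in I,\ Y(j) = X(i) \}}\,Y\,r_1 \ldots r_p \mid BY \in K \wedge B \in N_R \,\}.
\end{aligned}$$

So R is synchronized by $R^{+}$ and $G^{+} \in (R^{+})^{\omega}$ for $G \in R^{\omega}$.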
To characterize Sync($R^{+}$) from Sync(R), we define the regular closure Reg(E)
of any language family E as being the smallest family of languages containing
E and closed under $\cup$, $\cdot$, $+$.


Proposition 4.7 For any 0-grammar R, Sync($R^{+}$) = Reg(Sync(R)).


By Propositions 4.3, 4.6 and 4.7, the following regular automaton G:

[Figure: the regular automaton G; missing from the source.]

has the same synchronized languages as the automaton of Example 3.9:
Sync(G) is the family of input-driven languages (for a pushing, b popping and
c internal). By adding a b-loop on the initial (and final) vertex of G, we obtain
an automaton H such that Sync(H) is the family of visibly pushdown languages,
hence by Proposition 4.7, is closed under $\cdot$ and $+$.


Example 4.8 A natural extension of the visibly pushdown languages is to add
reset letters. For a pushing, b popping and c internal, we add a reset letter d to
define the following regular automaton G:

[Figure: the regular automaton G with reset letter d; not recoverable from the source.]

Any language of Sync(G) is a visibly pushdown language taking d as an internal
letter, but not the converse: $\{\, a^n d\, b^n \mid n > 0 \,\} \notin$ Sync(G). By Theorem 4.5,
Sync(G) is a boolean algebra. Furthermore the following automaton H:

[Figure: the automaton H; not recoverable from the source.]

satisfies Sync(H) = Sync(G) and $H^{+} = H$, hence by Proposition 4.7, Sync(G)
is also closed under $\cdot$ and $+$.


Note that the automata of the previous example have infinite degree. Furthermore
for any automaton G of finite degree having an infinite set of initial or
final vertices, the pointed automaton $G_{\bot}^{\top}$ is of infinite degree. However any
regular automaton of infinite degree (in fact any prefix-recognizable automaton)
can be obtained by ε-closure from a regular automaton of finite degree using
ε-transitions. For instance let us take a new letter $e \notin T$ (instead of the empty
word) and let us denote by $\pi_e$ the morphism erasing $e$ in the words over $T \cup \{e\}$:
$\pi_e(a) = a$ for any $a \in T$ and $\pi_e(e) = \varepsilon$, that we extend by union to any language
$L \subseteq (T \cup \{e\})^{*}$: $\pi_e(L) = \{\, \pi_e(u) \mid u \in L \,\}$, and by powerset to any family $P$ of
languages: $\pi_e(P) = \{\, \pi_e(L) \mid L \in P \,\}$. The following regular automaton K:

[Figure: the regular automaton K with e-transitions; not recoverable from the source.]

is of finite degree and satisfies $\pi_e$(Sync(K)) = Sync(G) for the automaton G
of Example 4.8. Let us give a simple transformation of any grammar R to a
grammar $R_e$ such that $R_e^{\omega}$ is of finite degree and $\pi_e$(Sync($R_e$)) = Sync(R).
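
The erasing morphism and its two extensions are easy to state on finite data. The Python sketch below assumes that every letter is a single character and that languages and families are finite sets. The function names are hypothetical.

```python
def erase(word, e="e"):
    """Image of a word under the erasing morphism: delete every occurrence of e."""
    return word.replace(e, "")

def erase_language(language, e="e"):
    """Extension to a (finite) language, by union over its words."""
    return {erase(u, e) for u in language}

def erase_family(family, e="e"):
    """Extension to a (finite) family of languages."""
    return {frozenset(erase_language(lang, e)) for lang in family}

L = {"aeb", "eabe", "ee", "c"}
image = erase_language(L)
```

Distinct words, and distinct languages, can have the same image: `image` contains only "ab", the empty word and "c", and `erase_family([{"ae"}, {"ea"}])` is a family with a single language.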

As Sync(R) = Sync($<R_{\bot}^{\top}>$), we restrict this transformation to arc-grammars.
Let R be an arc-grammar. We define $R_e$ to be an arc-grammar obtained from R
by replacing each non-axiom rule $Ast \longrightarrow H$ by the rule:

$$Ast \;\longrightarrow\; \big([H] \,\cup\, \{\, s \xrightarrow{e} s_e,\ t_e \xrightarrow{e} t \,\} \,\cup\, h(H - [H])\big)_{|P}$$

where $s_e, t_e$ are new vertices, $h$ is the vertex mapping defined for any $r \in V_H$
by $h(r) = r$ if $r \notin \{s, t\}$, $h(s) = s_e$ and $h(t) = t_e$, and $P$ is the set of vertices
accessible from $s$ and co-accessible from $t$. For instance the arc-grammar R

[Figure: rules of the arc-grammar R; not recoverable from the source.]

is transformed into the following arc-grammar $R_e$:

[Figure: rules of the arc-grammar $R_e$; not recoverable from the source.]

For any rule of $R_e$, the inputs are separated from the outputs (by e-transitions),
hence $R_e^{\omega}$ is of finite degree. Furthermore this transformation preserves the
synchronized languages.


Proposition 4.9 For any arc-grammar R, Sync(R) = $\pi_e$(Sync($R_e$)).


So for any R, Sync(R) = $\pi_e$(Sync($<R_{\bot}^{\top}>_e$)) and $(<R_{\bot}^{\top}>_e)^{\omega}$ is of finite degree.


All the constructions given in this paper are natural generalizations of usual 
transformations on finite automata to graph grammars. In this way, basic clo- 
sure properties could be lifted to sub-families of context-free languages. 


Conclusion 


The synchronization of regular automata is defined through devices generating 
these automata, namely functional graph grammars. It can also be defined using
pushdown automata with ε-transitions [NS 07] because Theorem 3.7 asserts
that the family of languages synchronized by a regular automaton is independent
of the way the automaton is generated; it is a graph-related notion. This
paper shows that the mechanism of functional graph grammars provides natural 
constructions on regular automata generalizing usual constructions on finite au- 
tomata. This paper is also an invitation to extend the notion of synchronization 
to more general sub-families of automata. 


Acknowledgements 


Many thanks to Arnaud Carayol and Antoine Meyer for helping me prepare the 
final version of this paper. 


References 


[AM 04] R. ALUR and P. MADHUSUDAN, Visibly pushdown languages, 36th STOC,
ACM Proceedings, L. Babai (Ed.), 202-211 (2004).

[Be 79] J. BERSTEL, Transductions and context-free languages, Ed. Teubner, pp. 1-278 (1979).

[BB 02] J. BERSTEL and L. BOASSON, Balanced grammars and their languages, Formal
and Natural Computing, LNCS 2300, W. Brauer, H. Ehrig, J. Karhumäki, A. Salomaa
(Eds.), 3-25 (2002).

[Ca 06] D. CAUCAL, Synchronization of pushdown automata, 10th DLT, LNCS 4036,
O. Ibarra, Z. Dang (Eds.), 120-132 (2006).

[Ca 07] D. CAUCAL, Deterministic graph grammars, Texts in Logic and Games 2,
Amsterdam University Press, J. Flum, E. Grädel, T. Wilke (Eds.), 169-250 (2007).

[Ca 08] D. CAUCAL, Boolean algebras of unambiguous context-free languages, 28th
FSTTCS, Dagstuhl Research Online Publication Server, R. Hariharan, M. Mukund,
V. Vinay (Eds.) (2008).

[CH 08] D. CAUCAL and S. HASSEN, Synchronization of grammars, 3rd CSR,
LNCS 5010, E. Hirsch, A. Razborov, A. Semenov, A. Slissenko (Eds.), 110-121 (2008).

[Ha 78] M. HARRISON, Introduction to formal language theory, Addison-Wesley (1978).

[Me 80] K. MEHLHORN, Pebbling mountain ranges and its application to DCFL
recognition, 7th ICALP, LNCS 85, J. de Bakker, J. van Leeuwen (Eds.), 422-432 (1980).

[MS 85] D. MULLER and P. SCHUPP, The theory of ends, pushdown automata, and
second-order logic, Theoretical Computer Science 37, 51-75 (1985).

[NS 07] D. NOWOTKA and J. SRBA, Height-deterministic pushdown automata,
32nd MFCS, LNCS 4708, L. Kučera, A. Kučera (Eds.), 125-134 (2007).


